Anonymizing Machine Learning Models
Abstract
There is a known tension between the need to analyze personal data to drive business and privacy concerns. Many data protection regulations, including the EU General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), set out strict restrictions and obligations on the collection and processing of personal data. Moreover, machine learning models themselves can be used to derive personal information, as demonstrated by recent membership and attribute inference attacks. Anonymized data, however, is exempt from the obligations set out in these regulations. It is therefore desirable to be able to create models that are anonymized, thus also exempting them from those obligations, in addition to providing better protection against attacks. Learning on anonymized data typically results in significant degradation in accuracy. In this work, we propose a method that is able to achieve better model accuracy by using the knowledge encoded within the trained model and guiding our anonymization process to minimize the impact on the model's accuracy, a process we call accuracy-guided anonymization. We demonstrate that by focusing on the model's accuracy rather than generic information loss measures, our method outperforms state-of-the-art k-anonymity methods in terms of the achieved utility, in particular with high values of k and large numbers of quasi-identifiers. We also demonstrate that our approach has a similar, and sometimes even better, ability to prevent membership inference attacks as approaches based on differential privacy, while averting some of their drawbacks such as complexity, performance overhead and model-specific implementations. This makes model-guided anonymization a legitimate substitute for such methods and a practical approach to creating privacy-preserving models.
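The abstract compares against k-anonymity methods without restating the property they enforce. As a minimal sketch (not the authors' method), the following Python checks whether a set of records is k-anonymous with respect to a chosen list of quasi-identifiers, meaning every combination of quasi-identifier values is shared by at least k records. The function names, the toy records and the generalized ranges are all hypothetical and serve only as illustration.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records that share the same
    quasi-identifier values (0 if there are no records)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group_size(records, quasi_identifiers) >= k


# Hypothetical toy data: age and zip code are already generalized to ranges.
records = [
    {"age": "30-39", "zip": "021**", "label": 1},
    {"age": "30-39", "zip": "021**", "label": 0},
    {"age": "40-49", "zip": "100**", "label": 1},
    {"age": "40-49", "zip": "100**", "label": 1},
    {"age": "40-49", "zip": "100**", "label": 0},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

This check only verifies the property on data that has already been generalized. How the generalization ranges are chosen is where the approaches differ: per the abstract, accuracy-guided anonymization selects them to minimize the impact on the trained model's accuracy rather than a generic information loss measure.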
Similar resources
Dust source mapping using satellite imagery and machine learning models
Predicting dust source areas and determining the factors affecting them is necessary in order to prioritize management and practices to deal with desertification due to wind erosion in arid areas. Therefore, this study aimed to evaluate the application of three machine learning models (including generalized linear model, artificial neural network, random forest) to predict the vulnerability of dust cent...
Machine Learning Models for Housing Prices Forecasting using Registration Data
This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...
Debugging Machine Learning Models
Creating a machine learning solution for a real world problem often becomes an iterative process of training, evaluation and improvement where the best practices and generic solutions are few and far between. Our work presents a novel solution for an essential step of this cycle: the process of understanding the root causes of ’bugs’ (particularly consequential or confusing test errors) discove...
Calibration of Machine Learning Models
The evaluation of machine learning models is a crucial step before their application because it is essential to assess how well a model will behave for every single case. In many real applications, not only is it important to know the “total” or the “average” error of the model, it is also important to know how this error is distributed and how well confidence or probability estimations are mad...
Semantic models for machine learning
In this thesis we present approaches to the creation and usage of semantic models by the analysis of the data spread in the feature space. We aim to introduce the general notion of using feature selection techniques in machine learning applications. The applied approaches obtain new feature directions on data, such that machine learning applications would show an increase in performance. We rev...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-93944-1_8